A FAST AND WELL-CONDITIONED SPECTRAL METHOD 
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Abstract. A novel spectral method is developed for the direct solution of linear ordinary 
differential equations with variable coefficients. The method leads to matrices which are almost 
banded, and a numerical solver is presented that takes 0(m-^n) operations, where m is the number 
of Chebyshev points needed to resolve the coefficients of the differential operator and n is the number 
of Chebyshev points needed to resolve the solution to the differential equation. We prove stability 
of the method by relating it to a diagonally preconditioned system which has a bounded condition 
number, in a suitable norm. For Dirichlet boundary conditions, this reduces to stability in the 
standard 2-norm. 
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1. Introduction. Spectral methods are an important tool in scientific comput- 
ing and engineering applications for solving differential equations (see, for instance, 
[3j m EQI El]). Although the computed solutions can converge super-algebraically 
to the solution of the differential equation, conventional wisdom states that spectral 
methods lead to dense, ill-conditioned matrices. In this paper, we introduce a spec- 
tral method which continues to converge super-algebraically to the solution, but only 
requires solving an almost banded, well-conditioned linear system. 

Throughout, we consider the family of linear differential equations on [—1, 1]: 

Lm = / and Bu = c (1-1) 
where L is an A^th order linear differential operator 

Lu — a (x)—^ + • • • + a [x)- ha (x}u, 

ax" ax 

B denotes K boundary conditions (Dirichlet, Neumann, etc.), c € and a°, . . . , a^, 
/ are suitably smooth functions on [—1, 1]. We make the assumption that [x) does 
not vanish on the interval [—1, 1]. 

Within spectral methods there is a subdivision between collocation methods and 
coefficient methods; the former construct matrices operating on the values of a func- 
tion at, say, Chebyshev points; the latter construct matrices operating on coefficients 
in a basis, say, of Chebyshev polynomials. Here there is a common belief that collo- 
cation methods are more adaptable to differential equations with variable coefficients; 
i.e., when aP{x), . . . , a^{x) are not constant (see Section 9.2 of [2]). However, the spec- 
tral coefficient method that we construct is equally applicable to differential equations 
with variable coefficients. 

For example, the first order differential equation (chosen so the entries in the 
resulting linear system are integers) 

dii 

— +Axu = and u(-l) = c, (1.2) 
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results in our spectral method forming the almost banded n x n linear system 



/ 1 -1 1 -1 1 -1 

2 -1 

2 2-1 

1 3 -1 
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We then approximate the solution to (|1.2p as 
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(1.3) 



V j 



k=0 

where Tk is the Chebyshev polynomial of degree k (of the first kind). Moreover, before 
solving ()1.3p we use a diagonal preconditioner to scale the columns and, in this case, 
the preconditioned matrix system has 2-norm condition number bounded above by 
53.6 for all n. The preconditioned matrix, just like (jl.3p . is almost banded and it can 
be solved in 0(n) operations (as discussed in Section [5]). 

We use boundary bordering. That is, K rows of the linear system are used to 
impose the K boundary conditions. For (jl.2|) . the last row of the linear system has 
been replaced and then permuted to the first row (for numerical convenience, later). 
Imposing the boundary condition in this way results in an error, analogous to the 
error in the tau method [2], which tends to zero as n — ^ oo. 

Alternatively, the boundary condition could be imposed by basis recombination. 
That is, for our simple example, by computing the coefhcients in the expansion 



{x) = cTo{x) + ^Uk4'k{x) 



k=l 



where 



(l>k{x) 



Tkix)-TQ{x) fc even 
Tkix)+To{x) fc odd. 



The basis is chosen so that (j)k{—l) = 0, and there are many other possible choices. 
Basis recombination is commonly used in Petrov-Galerkin spectral methods (see |llj). 
where the Galerkin method uses a different trial and test basis. Shen has constructed 
bases for second, third, fourth and higher odd order differential equations with con- 
stant coefficients so that the resulting matrices are banded or easily invertible [18l [19] . 
Moreover, it was shown that very specific variable coefficients preserve this matrix 
structure pO] . 

Our method will also use two different bases: the Chebyshev and ultraspherical 
polynomial bases, which are very similar to the bases considered in [71 1111 118j . but 
in our case will not depend on the boundary conditions. Basis recombination runs 
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counter to an important observation of Orszag 15 : Spectral methods in a convenient 
basis (like Chebyshev) can outperform an awkward problem-dependent basis. The 
problem-dependent basis may be theoretically elegant, but convenience is worth more 
in practice. In detail, the benefit of using boundary bordering instead of basis recom- 
bination include: (1) the solution is always computed in the convenient and orthogonal 
Chebyshev basis; (2) a fixed basis means we can automatically apply recurrence re- 
lations between Chebyshev polynomials (and later, ultraspherical polynomials); (3) 
we can easily incorporate variable coefficients; (4) the structure of the linear systems 
does not depend on the boundary condition(s), allowing for a fast, general solver (see 
Section [5]); and (5) it seems unlikely that basis recombination can be formulated in a 
stable way for very high order boundary conditions. 

Chebyshev polynomials or, more generally, Jacobi polynomials have been abun- 
dantly used to construct spectral coefficient methods. The tau-method can deal with 
variable coefficients, but results in dense matrices [M]. The spectral collocation 
method, as implemented in Chebf un, automatically resolves a solution to a differ- 
ential equation, but suffers from ill-conditioning [84 . The dual-Petrov-Galerkin spec- 
tral method is efficient, but not for general variable coefficient differential equations 

[HIIIIIIH]. 

We mention that the solution of linear equations with variable coefficients is 
absolutely critical to the solution of nonlinear problems. While we focus on linear 
equations in this paper, numerical experiments confirm that the approach is applicable 
to nonlinear problems as well, continuing to achieve stability. However, the diagonal 
form of variable coefficients in collocation methods may prove computationally more 
effective for certain problems, though at the expense of accuracy. 

This paper is organised as follows. In the next section we construct the method for 
first order differential equations and apply it to two problems with highly oscillatory 
variable coefficients. In the third section we extend the approach to higher order 
differential equations by using ultraspherical polynomials. We also present numerical 
results for challenging second and higher-order differential equations. In the fourth 
section, we prove stability and convergence of the method in high order norms, which 
reduce to the standard 2-norm for Dirichlet boundary conditions. In section five, 
we present a fast, stable algorithm to solve the almost banded matrices in 0(m^n) 
operations; and in the final section we describe directions for future research. 

2. Chebyshev Polynomials and First Order Differential Equations. For 

pedagogical reasons, we begin by solving first-order differential equations of the form 

u' {x) + a{x)u{x) = f {x) and w(— 1) = c (2.1) 

where a : [—1,1] — > C and / : [—1,1] — t- C are continuous functions with bounded 
variation. The continuity assumption ensures that ()2.1|) has a unique continuously 
differentiable solution on the unit interval [16J. In practice, one could solve such 
equations using quadrature on the integral formulation of the problem. 

Suppose that g(x) is a continuous function with bounded variation on [—1,1]. 
Then g has a unique representation as a uniformly convergent Chebyshev expansion 
[El Thm. 5.7]. That is. 



00 

gi^) =^9kTk{x) 

k=0 



(2.2) 
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where 

9k = - —==-dx, k>l and go = - dx. (2.3) 

One way to approximate 5(2;) is to truncate (I2.2p after the first m terms, to obtain 
the polynomial 

m — 1 

5trunc(a;) = ^ gkTk{x), (2.4) 

/c=0 

assuming we know, or can calculate, gk- A second approach is to interpolate g{x) at 
m Chebyshev points, to obtain the polynomial 

m — 1 

5interp(a;) = ^ gkTk{x). (2.5) 

fc=0 

The coefiicients {^fc} and {gk} are related by an aliasing formula stated by Clenshaw 
and Curtis [5J. Usually, in practice, g{x) will be many times differentiable on [—1,1], 
and then (|2.4I) and (|2.5p converge uniformly to g{x) at an algebraic rate as m — 00. 
Moreover, the convergence is spectral (super-algebraic) when g is infinitely differen- 
tiable, which improves to be exponential when g is analytic in a neighborhood of 
[—1,1]. If g is entire, the convergence rate improves further to be super-exponential. 

Mathematically, we seek the Chebyshev coefficients of the truncation, (12. 4|) . of 
a{x) and f{x) which are defined by the integrals (|2.3I) . However, we use the coefficients 
in the polynomial interpolant as an approximation, since coefficients in (12. 5p are faster 
to compute and match to machine precision for large enough m. The degree of the 
polynomial used to approximate a{x) will later be closely related to the bandwidth 
of the almost banded linear system we form to solve the differential equation p.ip . 

The spectral method in this paper solves for the coefficients of the solution in a 
Chebyshev series. In order to achieve this, we need to be able to represent differenti- 
ation, u'{x), and multiplication, a{x)u(x), in terms of operators on coefficients. 

2.1. First Order Differentiation Operator. The derivatives of Chebyshev 
polynomials satisfy 

dJl^ikCl}\ k>l 
dx k^O 

where C^"'_\(a;) is the Chebyshev polynomial of the second kind of degree A: — 1. We 
use C^^^ instead of U to highlight the connection to ultraspherical polynomials, which 
are key to extending the method to higher order differential equations. 

Now, we use ()2.6p to derive a simple expression for the derivative of a Chebyshev 
series. Suppose that u{x) is given by the Chebyshev series 

00 

w(^) = ^UkTk{x). (2.7) 

k=0 

Then 



^'(^) =^kukCi^}iix). 

fc=i 
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In other words, the vector of C^^-* coefficients of the derivative is given by "DqU where 
Vq is the differentiation operator 



/O 1 



V 
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(2i 



and u is the vector of Chebyshev coefficients for u{x). Note that this differentiation 
operator is sparse, in stark contrast to the classic differentiation operator in spectral 
collocation methods 

2.2. Multiplication Operator for Chebyshev Series. In order to handle 
variable coefficients of the form a{x)u{x) in (j2.ip we need to represent the multipli- 
cation of two Chebyshev series as an operator on coefficients. To this end, let 



and 



k=0 



Multiplying these two expressions together we obtain 
a{x)u{x 



oo oo 



^^ajUkTj{x)Tk{x) = ^CfcTfc(a;), 

j=0 fc=0 /s=0 

and desire an explicit form for the c^'s. By Proposition 2.1 of T] we have that 

/c 



Cfe 



{, 1 v^oo 
aouo + 2 2^1=1 am 
1 v^fc-l 
2 2^1=0 O-k-m 



aaUk 



i Si^i + 5 Z];^o k>l. 



(2.9) 



This can be represented by a Toeplitz plus an almost Hankel operator since c 
A^o[a]u where 



Mo[a] = 



At first glance, it appears that the multiplication operator and any truncation are 
dense. However, since a{x) is continuous with bounded variation, we are able to 
uniformly approximate a(x) with a finite number of Chebyshev coefficients to any 
desired accuracy. That is, for any e > there exists an m S N such that 
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fc=0 



< e. 



As long as m is large enough, to all practical purposes, we can use the truncated 
Chebyshev series to replace a{x). Recall, in our implementation we approximate 
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this truncation by the polynomial interpolant of the form (12.51) . Hence, the n x n 
principal part of A^o[fl] is banded with bandwidth m for n > m. Moreover, m can be 
surprisingly small when a{x) is analytic or many times differentiable. 

There is still one ingredient absent: The operator Vq returns coefficients in a 
series, whereas the operator A^o[a] returns coefficients in a Chebyshev series. In order 
to correct this, we require an operator that maps coefficients in a Chebyshev series to 
those in a series. 

2.3. Transformation Operator for Chebyshev Series. The Chebyshev poly- 
nomials Tk{x) can be written in terms of the C^^-* polynomials by the recurrence 
relation 



1/^(1) 

2^1 



(1) 

k-2 



(1) 



k>2 
k = 1 
k = 0. 



(2.10) 



A more general form of (|2.10p . which we will use later, is given in [13]. Suppose that 
u{x) is given by the Chebyshev series (|2.7|) . Then, using (I2.10p we have 

CO oo 

u{x) = Y.''un{x) ^ u,c'^\x) + -u,&l\x) + 2 E"^- K^(^) - ^fe'-2(^)) 

fc=0 ' ^ 



k=2 

1 \ °° 1 

^U2 (^) + E 2 - "''+2) ^^'^ 

^ k=l 



Hence, the C*^^^ coefficients for u{x) are Sq\i where Sq is the transformation operator 



5o = 



\ 



\ 



■I 



(2.11) 



and u is the vector of Chebyshev coefficient for u(x). Note that this transformation 
operator, and any truncation of it, is sparse and banded. 

2.4. Discretization of the system. Now, we have all the ingredients to solve 
any differential equation of the form (|2.ip . Firstly, we can represent the differential 
operator as 



L :=I?o + 5oXo[a], 

which takes coefficients in a Chebyshev series to those in a series. Due to this fact, 
the right-hand side j(x) must be expressed in terms of its coefficients in a series. 
Hence, ignoring the boundary condition, we can represent the differential equation 
(O) as 



Lu = 5of . 



In the above, the vectors u and f are the vectors of coefficients in the Chebyshev 
series of the form (|2.2p for u(x) and /(a;), respectively. 
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We truncate the operators to derive a practical numerical scheme. To this end, 
define the n x oo projection operator as 



Vn = (/n,0) 



(2.12) 



Truncating the differentiation operator to Vn'Do'Pj results in an n x n matrix with 
a zero last row. This observation motivates us to impose the boundary condition by 
replacing the last row of VrJ^V^ ■ We take the convention of permuting this boundary 
row to the first row so that the linear system is close to upper triangular. That is, in 
order to obtain an approximate solution to (j2.ip we solve the system 



/ 



Uo 
Ml 



with 



Un-1 J 

( To{-l 
V 



V 



(2.13) 



ri(-i) ... T„_i(-i) \ 

J 



(Note that Tki-1) 
puted solution, 



(—I)''.) The solution u{x) is then approximated by the com- 



n-l 

E 

fc=0 



UkTk{x) 



Remark In practice, one would typically generate the spectral system by dis- 
cretizing each operator individually; i.e., instead of An we would have 

Vn-iVoV^ + (Vn^iSoV^) {rnM[a]rJ) . 

We, however, use An for simplicity, and because we can generate it explicitly: 

An - Vn-lVnVj + {Vn-lS^VZ+^) {Vn+2M[a]VZ) . 

We now use this method to solve two first-order differential equations. 

2.5. Numerical Examples. We solve the linear system (|2.13l) for progressively 
larger n, and terminate the process when the tail of the solution falls below the 
relative magnitude of machine epsilon. This automatic resolution of the solution to a 
differential equation is similar to the one used in Chebfun [8 . However, Chebfun has 
to be careful how fast it increases n, due to ill-conditioning of the spectral collocation 
matrices. Our approach does not have this drawback. 

For the first example, we consider the linear differential equation 



u'{x) -f x^u{x) = 100 sin(20,000a;2) u{-\) = 
which has a highly oscillatory forcing term. The exact solution is 



(2.14) 



u(x) 



= e 4 



100e~ sin(20,000i2)dt 
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Fig. 2.1: Left: Plot of the highly oscillatory solution to (|2.14p with the maximum of 
the solution computed and marked by a circle. Right: Spy plot of the almost banded 
linear system for n — 50. 



The computed solution, which is a 20,391 degree polynomial, uniformly approx- 
imates the exact solution to ten times machine epsilon. In Figure 12.11 we plot the 
computed oscillatory solution, and a realization of the matrix when n = 50. This spy 
plot shows the almost banded structure of the linear system. 

The approximate solution is expressed in terms of a Chebyshev basis, which is 
convenient for further manipulation, e.g. using Chebfun. For example, its maximum 
is 2.3573 (circled in Figure [OJ , its integral is 3.2879 and the equation u{x) = 1.3 has 
113 solutions. 

As a second example we consider the linear differential equation 

u'{x) + u{x) = and u{-l) = 1. (2.15) 

ax + 1 

We take a = 5 x 10^, in which case the variable coefhcient can be approximated to 
roughly machine precision by a polynomial of degree 7,350, and hence, for very large 
n, the linear system (|2.13|) is banded. The exact solution to (|2.15|) is 



u{x) — exp 



tan ^(v^a::)+tan ^(-/a) 
-v/a 



which can be approximated to machine precision by a Chebyshev polynomial of degree 
5,377. The computed solution u{x), is a Chebyshev polynomial of degree 5,093 and 



(^J {u{x) - u{x)f dx^ =2.86x10"^^ 



A plot of the solution and the Cauchy error are included in Figure 12.21 The Cauchy 
error plot confirms that the solution is, up to machine precision, independent of the 
number of coefficients in its expansion for n > 5,100. 
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Fig. 2.2: Left: Plot of the computed solution to p.isp with a = 5 x 10''. Right: 
Plot of the Cauchy error for the solution coefficients. The plot shows the 2-norni 
difference between the coefficients of the approximate solution when solving an n x n 
and [l.Oln] x [l.Oln] matrix system. 



3. Ultraspherical polynomials and higher order differential equations. 

We now generalize the approach to higher order differential equations of the form 

E«'(^)^^=/(-) on [-1,1], (3.1) 

with general boundary conditions Bu = c. We assume that the boundary operator 
B is given in terms of the Chebyshev coefficients of u. For example, the Dirichlet 
boundary operator is 



B = 



To(-l) Ti(-l) T2(-l) --A^ A -1 1 "1 
ro(l) Ti(l) 72(1) ■■■) ll 1 1 1 



while the Neumann operator is 

/r^(-i) Tii-i) m-i) ■■■\_fo 1 -4 ... (-l)'=+ifc2 . 

""-{T^ii) nil) mi) ■■■) 1,0 1 4 ... 

We continue to impose that all the variable coefficients are continuous with bounded 
variation to ensure that each coefficient can be represented by a finite Chebyshev 
series. 

The approach of the first order method relied on the three relations: differentia- 
tion (|2.6I) . multiplication (|2.9p and transformation (|2.10p . To generalize the spectral 
method to higher order differential equations we use similar relations, now in terms 
of higher order ultraspherical polynomials. 

The ultraspherical (or Gegenbauer) polynomials c'!^\x), Cp''(a;), . . . are a family 
of polynomials orthogonal with respect to the weight 

(i-x2)^-i 
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defined uniquely by 



We will only use ultraspherical polynomials for A = 1,2, 
normalizing the leading coefRcient so that 

c'^\^) = '^^' + &{^'-'\ 

where (A)?; = ^^(^^i^^' denotes the Pochhammer symbol. In particular, the ultras- 
pherical polynomials with A = 1 are the Chebyshev polynomials of the second kind, 
which we denote by C'-^^ 

Importantly, ultraspherical polynomials satisfy an analogue of ^2.6\i . as stated in 
the NIST Handbook of Special Functions [13 . That is, for A > 1, 



(A) 



'2AC[i+^^ k>l 



dx 







0. 



(3.2) 



Moreover, they also satisfy an analogue to (|2.10p for A > 1: 



\ f/^(A+l) 

\+k \yk 



(A+l) 
k~2 



(A+l) 



k>2 
k = 1 
k^O. 



(3.3) 



Suppose that u{x) is represented as the Chebyshev series (|2.7p . Then for A 
1,2,... we have, by (1^ . 



d^u(x) , d- 

- kuk 



dx^ 



dx 



A-l 



We now apply the relation 

d^u{x) 
dx^ 



k=l 

A — 1 times to obtain 

oo 
k=X 



This means that the A-order differentiation operator takes the form 

/ A times \ 
6 A 

(A-l)! A + l 

A + 2 



lA-l 



(3.4) 



In the process of differentiation, Dx converts coefficients in a Chebyshev series to 
coefficients in a series. Moreover, using p. 31) the operator which transforms 

coefficients in an ultraspherical series with parameter A to those in a series with 
parameter A + 1 is 



1 



Sx 



X 
A+l 



A_ 

A+2 

A 
A+2 



A 

" A+3 



A 

' A+4 



V 



■7 



(3.5) 



As before, we also require a multiplication operator. Since Dx returns coefficients in 
series we need a multiplication operator that performs multiplication between 
two C'^^^ ultraspherical series. 
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3.1. Multiplication Operator for Ultraspherical Series. In order to handle 
the variable coefficients in p.ip . we must represent multiplication of two ultraspherical 
series in coefficient space. Given two functions 



a{x) ~ ^fljCj'^ (x) and u(x) = u^C^' { 

j=0 k=0 



we have 



oo oo 

a{x)u{x) = ^^a,^.,c]''(x)cf (3.6) 

j=0 k=0 

To obtain a C^^) series for a{x)u{x) we use the linearization formula shown by Carlitz 
[3]. The linearization formula is 

min(j,fc) 

Cf\x)ci'\x)= J2 c^i.hk)d,l\_2si^), (3.7) 

s=0 

where 

A. . , ^ J + + A - 2,s (A),(A)j-_,(A)fc_, i2\),+k-s jj + k- 2sy. 

' 3 + k + X-s s\{j - ,s)!(fc - s)! (A),+fe_, (2A),+fe_2, ^ ' ' 

and (A)fe = '''^(^'l^y' is the Pochhammer symbol. We substitute p.7p into p.6p and 
rearrange the summation signs to obtain 

oo / oo k 

a{x)u{x) = E E E a2s+j-kci{k,2s + j - k)uk \ cf\x). (3.9) 

j=0 \k=0 s=max(0,fc-j) 

From (13. 9p the (j, fc) entry of the multiplication operator representing the product of 
a{x) in a C''*'^ series is 

k 

M\[a]j^k= E a2s+j~kC^{k,2s + i - k), j,k>0. 

s— max(0,A:— j) 

Just as before, in practice, a{x) will be represented by a finite series. That is, 

m—l 

a{x) « E H^)' (3-10) 

3=0 

and with this truncation of the C^'^^ series for a{x) the matrix VnMx[a]'Pn is banded 
with bandwidth m for n > m. The expansion p.lOp can be computed by approxi- 
mating the first m Chebyshev coefficients in the Chebyshev series for a{x) and then 
applying a truncation of the transformation operator S\-i • ■ ■ Sq. 

The formula for c^{j,k), (|3.8p . cannot be used directly to form A^A[a] due to 
arithmetic overflow problems that arise for j or k greater than 70. Instead, we cancel 
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terms in the numerator and denominator of (j3.8p and match up the remaining terms 
of similar magnitude. The following relation holds: 



j + k + X-2s 
j + k + X- s 



TT^x 



n i 



x + t 



n 



2X + j + k - 2s + t 
X + j + k - 2s + t 



j-s-l 



n 



k- s+l+t 
k - s + X + t' 



(3.11) 

t=o ■' t=0 

All the fractions are of magnitude 0(1) in size, and hence VuM-xIo^PZ can be formed 
in a completely stable way. For the purposes of computational speed, we only apply 
p. lip once per entry, and use the recurrence relation 

i + k + X- s 



c^+,{j,k + 2) = c^{j,k) X 



j + k + X 
3 - s 



X + s 

X X 

s + 1 s + 1 

2X + j + k- s fc - 
X — X — 



s + X 



1 



X + j + k- 



s + 1 



to generate all the other terms required. 

Remark In the special case when A = 1 , the multiplication operator A4 1 [a] can 
be decomposed as a Toeplitz operator plus a Hankel operator. 

3.2. Discretization of the system. We now have everything in place to be 
able to solve higher order differential equations of the form (13.11) . Firstly, we can 
represent the differential operator as 



\=1 



N-1 



■SoMo[a° 



which takes coefficients in a Chebyshev series to those in a C*^^^ series. Due to this 
fact, the right hand side f{x) must be expressed in terms of its coefficients in a C^^^ 
series. Moreover, we impose the K boundary conditions on the solution by replacing 
the last K rows of VnLVn ■ We again take the convention of permuting these to the 
first K rows. That is, in order to obtain an approximate solution to p.ip we solve 
the system 



( 1 






Ui 




1 








\ Un-1 / 







\ 



•5of 



(3.12) 



where 



Ar. 



\ 



J 



and f is again a vector containing the Chebyshev coefficients of the right-hand side 
/. The solution u[x) is then approximated by the n-term Chebyshev series: 



u{x) 



E 

fc=0 
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Fig. 3.1: Left: Plot of the highly oscillatory solution to p.l3p with e = 10^®. Right: 
Plot of the Cauchy error for the solution coefficients. The plot shows the 2-norm 
difference between the coefhcients of the approximate solution when solving an n x n 
and [l.Oln] x [l.Oln] matrix system. 
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Fig. 3.2: Plot of the condition number of the spectral systems that our method 
forms against the size of the matrix system, for: (solid) e = 1 x 10^*, (dashed) 
e = 1 X 10~^ and (dot-dashed) e = 1. The plot demonstrates boundedness of the 
2-norm condition number. The observed error out performs what is suggested by the 
size of the constants. 



3.3. Numerical Examples. For the first example we consider the Airy differ- 
ential equation 



eu" (x) - xu{x) ^ and u(-l) = Ai f - y i j , u(l) = Ai | y i j (3.13) 



where Ai(-) is the Airy function of the first kind. 

In Figure IXTl we take e = 10^^ and plot the computed solution which is a poly- 
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Fig. 3.3: Left: Plot of the solution to the boundary layer problem p.l4p with e = 10 ^. 
Right: Plot of the Cauchy error in the 2-norm for the solution coefficients. 



nomial of degree 20,003. The exact solution to p.l3|) is the scaled Airy function, 




Letting u{x) denote the computed solution, we have 



iu{x) — u(x)y 



1/2 



1.07 X 10"". 



The magnitude of this L2 error is due to other numerical issues related to the size of 
the approximation and the high oscillations in the true solution, and the Cauchy error 
plot in Figure 13.11 indicates that the solution coefficients themselves are resolved to 
about machine precision, for n > 20,000. In Figure [221 we show numerical evidence 
that the condition number of the linear systems we form are bounded for all n. Later, 
for e = 1 , we show in Figure 15.21 that the derivatives of the solution are also well 
approximated. 

For the second example we consider the boundary layer problem, 

eu"{x) - 2x (^cos{x) - u'{x) + (^cos{x) - u{x) = (3.14) 



with boundary conditions 



m(-1) = u(l) = 1. 



Perturbation theory shows that the solution has two boundary layers at ±cos ""^(.8) 
both of width &{e^/^). In Fi gure 13.31 we take e = 10 The computed solution is of 
degree 15,394, and it is, again, confirmed by the Cauchy error plot that the solution 
is well-resolved. 

For the last example we consider the high order differential equation 

u^-^°\x) + cosh(a;)u(^)(a;) + x^u'^^^x) + x'^u^'^\x) + cos{x)u^^'> (x) + x^u{x) = 



15 




with boundary conditions 

^ u{l) ^ 0, u'(-l) = u'(-l) = 1, M^'^Hil) = 0: 2 < fc < 4. 

This is far from a practical example, and an exact solution seems difBcult (if not 
impossible) to come by. Instead, we note that if u{x) is the solution then it is odd; 
that is, u{x) — — x). Our method does not impose such a condition and therefore, 
we can use it along with the Cauchy error to gain confidence in the computed solution. 
The computed solution u{x) is of degree 55 and plotted in Figure [3^ Moreover, the 
computed solution is odd to about machine precision, 

/ .1 s 1/2 

U iu{x) + u{-x)f] = 1.252 X 10-". 



4. Stability and convergence. The 2-norm condition number of a matrix A S 
£"nxn defined as 

k{A) = \\A\\,\\A-%. 

In numerical experiments, k {An) grows proportionally with n, which is signif- 
icantly better than the typical growth of 0{n^^) in the condition number for the 
standard tau and collocation methods (see section 4.3 of [3]). However, the accuracy 
seen in practice even outperforms this: the backward error is consistent with a nu- 
merical method with bounded condition number. Later, we will show that a trivial, 
diagonal preconditioner results in a linear system with bounded condition number. 
Before that, we state the following lemma that explains the stability we observe in 
practice when solving ill-conditioned systems with Gaussian elimination. In section 
[5] we will present a stable algorithm that can solve the almost banded systems (with 
bandwidth m) with only 0{m?n) operations. 

Lemma 4.1. Suppose D is a diagonal matrix where all its entries are powers of 
the machine base. Then Gaussian elimination with partial pivoting (GEP) applied to 
solve Ac = b is stable if GEP applied to solve ADc{ — h is stable and ||-D||oo = 0(1). 
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Proof. The diagonal matrix D, applied on the right of A scales the columns by 
its diagonal entries and this scaling is done without error. With partial pivoting, the 
choice of pivot is not affected by exact column scaling and hence, the computed unit 
lower triangular matrix is exactly the same in the computed LU decompositions 
of A and AD. Moreover, the permutation matrices are identical. Therefore, the 
right-hand side after applying these matrices is the same, and thus we are solving 

{U + SU){c + Sc) = r + Sr and 
([/ + (5[/)L>(q + 5q) = r + Sr, 

where Sr, SU and Sq are small by assumption. 

We now show the error in backward substitution is small. We first note that 

Cfe = dkQk and Ck + 5ck = dk{qk + Sqk), 

and hence 

6ck = dkSqk 

is small. □ 

If the column scaling is not a power of the machine base then the effect on the 
choice of pivots is hard to quantify [10^. 

4.1. A diagonal preconditioner and compactness. Through-out this sec- 
tion we assume that (x) — 1 (otherwise, divide through by the coefficient on the 
highest order term). We make the restriction that K ^ N; that is, the A^th order 
differential equation has exactly N boundary conditions. When K < N it is more 
appropriate to choose a non-diagonal preconditioner, but we do not analysis that 
situation here. 

We show that there exists a diagonal preconditioner so that the preconditioned 
system has bounded condition number in higher order norms (Definition 14. 2p . For 
Dirichlet boundary conditions the preconditioned system has bounded condition num- 
ber in the 2-norm. Any differential equation with Robin boundary conditions can be 
written as a system of differential equations with Dirichlet conditions, and effectively 
solved with bounded 2-norm condition number. However, this is undesirable because 
(i) reformulating the differential equation is difficult to automate; and (ii) to compute 
n coefficients of the desired solution, linear systems much larger than n x n have to 
be solved. 

Define the diagonal preconditioner by, 

A'' times 

^= 2^^-i(A^-l)!'^'''^(^'---'^' A^' aTTT'-'-)- 

In practice, we observe that many other diagonal preconditioners also give a bounded 
condition number, and it is straightforward to construct such a preconditioner using 
only powers of the machine base. Moreover, it is likely that there are preconditioners 
which give much better practical bounds on the backward error. 

The analysis will follow from the fact that, on suitably defined spaces. 
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for a compact operator /C. To this aim, we need to be precise on which spaces these 
operators act on. Since we are working in coefficient space, we will consider the 
problems as defined in £\ spaces: 

Definition 4.2. C C°° is the Banach space with norm 



U 1,2 



\ k=0 



Lemma 4.3. Suppose that B : Ij^ ^ is bounded. Then 

^n = I + JC 

where K, : i\ is compact for X = D — 1,D, . . .. 

Proof. Note that 

B\ _ /2^-i(^ - .(B- - 

) + [ ; 



SN-iMN-i[a'^^'^]VN-i + ■■■ + Sn-1 ■ ■■SiMi[a^]V + Sn-i ■ --SoMia^ 

where Vn ■ t\ ^ is again the N x oo projection operator (|2.12p . 

We first remark that TZ : il ^ Therefore, BIZ : l\ is bounded for 

A = D — \,D, . . .. Furthermore, we remark that Sk : l\ ^ ^'^'^ bounded for 

A: = 1, 2, . . .. Finally, ■ i\ and it follows that 

Since " ^ U : l\ ^ t\ iov X ^ D l^D, . . . is bounded and 

orr 

are Sn-i, ■ ■ ■ ,Si (since SkR-^^ : t\ ^ i\ are bounded and TZ is compact). It follows 



'■A 

has finite rank, it is compact. Note that TZ is compact as an operator TZ : i\ ^ l\, as 
that the last term 





SN-i---SoM[a''] 



Tz-.el^ el 



is compact, since A^[a°] and Sq are also bounded. Finally, each of the intermediate 

terms are compact since TZ : t\is bounded and Sk are compact. □ 

The compactness of K, allows us to show well-conditioning and convergence. 

{B\ 

Lemma 4.4. Suppose that ( ^ 1 '■ ^a+i ^ ^a invertible operator for some 

X G {D — 1,1?,...}. Then, as n oo, 

\\AM\,2 = 0(1) and |l(A„i?„)"'ll£i = 0(1), 

for the diagonal matrix i?„ — VnTZV^ o^nd truncated spectral matrix 



An — T'n 
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Proof. Since TZ : is trivially invertible, we have that / + /C : €| ^ €| is 

invertible. The lemma follows since K is compact, A^Rn = AnVrJiV^ = I+Vn^Pj , 
and V^'PnK-Vn'Pn converges in norm to / + /C. □ 

4.2. Convergence. We furthermore have convergence, at exactly the same rate 
that the Chebyshcv expansion converges to the true solution. 

Theorem 4.5. Suppose f e ^a-jv+i Z^*" some A e {D - l,D,...}, and that 



£ ) • ^A+i ~^ ^\ '^'^ invertible operator. Define 



T/ien 



|u - K ^nWtl^^ < C\\n - P„ VnuW,.^^ 0. 



Proof. 

Note that ( „ c r 1 ^ ^ 

Oat-i • • • Dot 



Let v„ = i?„ and v = TZ ^u, so that 

v„ = (A„i?„)-ip„(/ + /C)v. 

Moreover, 

P„V = {AnRn)-^Vn{I + /C)PjP„V 

Thus 

-rJ{AnRnr^Vn{I + IC)v 
= {I- rjAAnRn)-^Vn{I + /C))(v - VjVnV) 

Moreover, since ||7e-iu||^2 < -l||u||^2^^ and ||7ev||^^_^, < (iV + l)||v||^2 we have, 

\u-VJu4p^^^ < (iV + l)||v-pJv„||^2 

< (TV + 1) [l + ||(A„i?„)-^ 11,2 (l + miel)] ||v - P>„v|| 



<C\\u-V^Vnu\\ , . 
Lastly, since u e , ^ we know that llu — VjVnu\\p2 — *• as n — )• oo. □ 

-t- "'•A+l 

5. Fast Linear Algebra for Almost Banded Matrices. The spectral method 
we have described requires the solution of a linear system Ax = b where A E C"^". 
The matrix A is banded, with lower bandwidth of mi = 0(m) and upper bandwidth 
of m2 = 0(m). except for the first K dense boundary rows. Here, mi > K and 
1712 > —mi. For simplicity we assume that m2 > 0. The typical structure of A is 



19 




(a) Original Matrix (b) After Left Givens Rotations (c) Upper Triangular 

Fig. 5.1: Typical structure of the matrices while solving spectral systems 



shown diagrammatically in Figure IS.lal Here, we describe a stable algorithm to solve 
Ax — b in @{rn?n) operations and with space requirement 0(mn). 

Since A is nearly upper triangular, apart from mi subdiagonals, a first guess at 
how to solve Ax = b would be to compute a QR factorization by applying Givens 
rotations on the left. However, whether we apply Givens rotations on the left or the 
right, the resulting upper triangular part will be dense because of the boundary rows. 
However, by applying a partial factorization on the left followed by another partial 
factorization on the right, we can obtain a factorization A = QRP* where P and Q 
are orthogonal and R is upper triangular with no more non-zero entries than A. 

We first apply Givens rotations on the left of A to introduce zero entries in the 
mith subdiagonal. These Givens rotations introduce zeros starting in the (rrii,!) 
entry and successively eliminating down the subdiagonal to the (n, n — mi + 1) entry. 
Each Givens rotation applied on the left of A performs a linear combination of two 
neighboring rows. One of the rows contains the non-zero entry to be eliminated and 
the other is the row immediately above this. After the first n — mi Givens rotations, 
the resulting matrix has a zero mith subdiagonal and, typically, non-zero entries 
along the (r7i2 + l)th superdiagonal have been introduced. In the same way, Givens 
rotations are then used to eliminate, in turn, the (mi — l)th, (toi— 2)th, . . ., (ii' + l)th 
subdiagonals. The typical structure of the resulting matrix is shown in Figure IS.lbl 
and has lower bandwidth of K and an upper bandwidth of m2 + mi — K, except for 
the first K non-zero boundary rows. 

In a similar fashion, we now apply Givens rotations on the right. The first se- 
quence of Givens rotations on the right eliminates the Kth subdiagonal by starting in 
the last row and successively eliminating to the Kth row. Each Givens rotation applied 
on the right performs a linear combination of two neighboring columns. One of the 
columns contains the non-zero entry to be eliminated and the other is the column im- 
mediately to the right. After the first n — K Givens rotations the resulting matrix has 
a zero Kth subdiagonal and, typically, non-zero entries along the (rn2 + 'tii — K + l)th 
superdiagonal have been introduced. Givens rotations are then used to eliminate, in 
turn, the {K — l)th, {K — 2)th,. . ., 1st subdiagonals. The resulting matrix is upper 
triangular, with upper bandwidth of mi +m2 (except for the first K rows) and shown 
diagrammatically in Figure IS.lcl Each Givens rotation, whether applied on the left 
or right, eliminated one zero in a subdiagonal and introduced, at most, one non-zero 
in a superdiagonal. In particular, the upper triangular matrix has no more non-zeros 
than the original matrix A. 

The factorization can be written as ^ = QRP*, where Q,P G C"^" are orthogo- 
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nal and R e C"X" jg upper triangular. To reduce storage requirements we overwrite 
A with Q, R and P as we perform the factorization using the same trick as described 
by Demmel fB^. Note that, if we were computing Ax = b just once, we would not store 
any information about Q, because we can apply the left Givens rotations directly to 
the vector b, as we go. However, in practice we progressively increase n, until the 
solution of the differential equation is well-resolved, and in this case we are able to 
reapply the left Givens rotation used on the smaller system to the new larger system. 
Solving the system Ax = b is computed with the following four steps: 

1. Factorize A = QRP* by applying left and right Givens rotations. 

2. Compute Q*b. 

3. Solve Ry = Q*b for y by backwards substitution. 

4. Compute x = Py. 

The factorization requires 0{mn) Givens rotations with each one performing a 
linear combination between 0(to) non-zero entries. Hence, this factorization can be 
computed in 0(m?n) operations. Backwards substitution takes 0{m?n) operations, 
since R is banded except for the first K = 0(1) rows. Moreover, Q*b and Py are 
computed by applying 0(mn) Givens rotations to vectors and hence, can be computed 
in 0{mn) operations. In total, the almost banded linear systems that our spectral 
method forms can be solved in @{m^ri) operations. 

Storage requirement is asymptotically minimal since A has 0{mn) non-zero en- 
tries, and R has no more than A. The total space requirement is 0(TOn), which is 
asymptotically the same as storing the original matrix A. 

5.1. Linear algebra stability in higher order norms. We first remark that 
if S is a bounded operator from if ig, such as Dirichlet boundary conditions, the 
results of Section [4] prove that the preconditioned linear system has bounded 2-norm 
condition number as n — oo. Because the QRP* decomposition is computed using 
Givens rotations, which are stable in £q [TO], as is backward substitution, we see that 
the linear algebra scheme applied to the preconditioned operator is stable, and has 
0{m^n) complexity. We remark that a variant of Lemma |4. II can be proved showing 
that this stability property holds for the non-preconditioned version as well. 

The results of Section |4] also show convergence and well-conditioning in higher 
order norms. One would expect numerical round-off in QRP* decomposition to de- 
stroy this convergence property. However, in practice, this is not the case. In Figure 
15.21 we solve the standard Airy equation as a two point boundary value problem: 
u" = xu,u{—l) = Ai(— — Ai(l). We witness convergence in higher order 
norms as well. In other words, the computed solution has a fast decaying tail, and all 
of the absolute error is in the low order coefficients. 

Convergence in higher order norms implies convergence of derivatives, and this is 
verified by differentiating the computed solution by applying Sq^Di repeatedly. (We 
note that the banded, upper triangular nature oiSo means that its inverse applied to a 
vector is computable in 0{n) time; after all, this is precisely the classical differentiation 
operator for Chebyshev expansions.) We compare the computed derivatives with the 
true derivatives of the Airy function Ai*-^-* at a single point, x — —1/2, and see 
convergence for p = 1,5 and 20. 

We note that the stability of the QRP* algorithm in higher order norms appears 
to follow from Q being almost banded: its banded along the superdiagonal, and decays 
spectrally along the subdiagonals. We will not attempt to prove this here as it seems 
likely that it only holds true generically; hence, a differential equation could possibly 
be constructed which breaks this property. 
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Fig. 5.2: Left: Plot of the Cauchy error of the solution coefficients measured in the: 
(solid) ^f-norm, (dashed) ^g-norm, (dot-dashed) i?20"iio''iT^- Right: Plot of the relative 
error in derivatives of the solution at a; = — ^ for the (solid) 1st, (dashed) 5th, (dot- 
dashed) 20th derivative. 



Since £| is a Hilbert space, Givens rotations can be constructed with the relevant 
inner product, resulting in orthogonal operations (i.e., with condition number one) in 
£\ . The stability of such an algorithm in £^ follows immediately. With this modifica- 
tion, we have an 0{m^n) stable algorithm which is guaranteed to converge in higher 
order norms. 

Remark While we have discussed the convergence in higher order norms with 
Dirichlet boundary conditions, the exact same logic applies to the convergence and 
stability observed with higher order boundary conditions. 

6. Future work. A straightforward extension of this work is to exotic boundary 
conditions. For example, we can impose a condition on the value of the solutions 
integral using the boundary operator 

(Aio,Mi, • • •) 

where fik are the Clenshaw-Curtis weights [5], computable in 0(?T.logn) time [22] . 
Other boundary conditions, such as Robin conditions or conditions involving values 
or derivatives inside (—1, 1), are equably imposable. 

One item for future work is to investigate the method on linear integro-differential 
equations, such as 

a{x)u'{x) + b{x)u{x)+ j c{s)u{s)ds ^ f{x). (6.1) 

In order to preserve the almost banded structure of the linear system the solution 
u{x) could be represented in a series; instead of a Chebyshev series. The integral 
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recurrence relation 

Un[s)ds = — — — , n > 

I n + 1 n + i 

means that the integration operator can be represented by 

1 1 



(I -1 i ...^ 

1 



Ti = 



Similar, almost banded systems can be constructed and used to solve (|6.ip . 

A second extension of this work is to nonlinear differential equations, of the form 

Bu = and hu + g{u) = f. 

It is straightforward to incorporate the approach of this paper into an infinite-dimensional 
Newton iteration, as in [8]. The Newton iteration takes the form 



Uk+l ^Uk + 



^ y ( " 

g'iuk) J yj^uk + g{uk) - / 



Since the linear operator we invert involves the solution itself, it is unlikely that 
the handedness of its discretization for large n would be useful, hence the resulting 
algorithm would not be faster than what exists. However, the well-conditioning of 
the linear operator is preserved, and would likely translate to better accuracy of the 
resulting algorithm for nonlinear problems. 

Lastly an exciting generalization of this work would be to higher dimensions, 
where the density of matrices has inhibited the usefulness of spectral methods. Con- 
structing a method on rectangular domains should be straightforward, using tensor 
products of ultraspherical polynomials. Indeed, a similar approach for Helmholtz 
equation (i.e., a constant coefficient PDE) was used in [TH]. Using the theory of [T7] . 
there are potentially generalization of ultraspherical polynomials to deltoid domains. 
What is less clear is how the results would be generalizable to more general domains, 
in particular, to a triangle. 
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